Entry Name: TUE-Elzen-MC2

VAST Challenge 2014
Mini-Challenge 2

 

 

Team Members:

 

Stef van den Elzen

Eindhoven University of Technology and SynerScope B.V.

s.j.v.d.elzen@tue.nl

PRIMARY

 

Paul van der Corput

Eindhoven University of Technology

p.n.a.v.d.corput@tue.nl

 

Martijn van Dortmont

Eindhoven University of Technology and SynerScope B.V.

m.a.m.m.v.dortmont@tue.nl

 

Roeland Scheepens

Eindhoven University of Technology

r.j.scheepens@tue.nl

 

Kasper Dinkla

Eindhoven University of Technology

k.dinkla@tue.nl

Student Team: YES

 

Analytic Tools Used:

Custom spatiotemporal visual analysis tool developed by team members

Custom matrix visual analysis tool (credit matrix) developed by team members

SynerScope Marcato (http://www.synerscope.com/)

 

Approximately how many hours were spent working on this submission in total?

240 hours

 

May we post your submission in the Visual Analytics Benchmark Repository after VAST Challenge 2014 is complete?

YES

 

Video: TUE-ELZEN-MC2.wmv 


 

 

 

Questions

 

MC2.1 Describe common daily routines for GAStech employees. What does a day in the life of a typical GAStech employee look like? Please limit your response to no more than five images and 300 words.

 

Based on the GPS data combined with the credit card transaction data, a typical GAStech employee arrives at the GAStech office between 8 and 9 AM each workday. Most go out for lunch at around 12 PM; some prefer the same lunch location each day, while others vary their choice. After about an hour they return to work and stay there until they go home between 5 and 6 PM. Many then go out for dinner or drinks and return home later in the evening.

This pattern seems to repeat over the course of the workweek for many of the regular employees (see Figures 1 and 2).

 

Figure 1: Representative workday of GAStech employee Calixto, Nils.

Figure 2: Small multiple detail visualization of typical workday of GAStech employee Calixto, Nils.

 

Truck drivers have a notably different daily routine (see Figure 3): they travel to various locations around town, including GAStech. The truck drivers and the other GAStech employees also appear to have very little contact outside of GAStech, which is supported by the lack of matches between them in the credit matrix (see Section MC2.2).

 

Figure 3: Truck drivers' routes compared to non-truck drivers' routes.

 

MC2.2 Identify up to twelve unusual events or patterns that you see in the data. If you identify more than twelve patterns during your analysis, focus your answer on the patterns you consider to be most important for further investigation to help find the missing staff members. For each pattern or event you identify, describe

a. What is the pattern or event you observe?

b. Who is involved?

c. What locations are involved?

d. When does the pattern or event take place?

e. Why is this pattern or event significant?

f. What is your level of confidence about this pattern or event? Why?

 

Please limit your answer to no more than twelve images and 1500 words.

 

 

Several patterns were discovered by matching credit card transactions between pairs of GAStech employees and visualizing the matches as a clustered heat map (see figure below). Two transactions match when they occur at the same establishment and within a user-configurable time span of each other. Moreover, transactions can be filtered by daily time span (e.g. narrowing transactions down to evenings) and by global time span (e.g. narrowing transactions down to the last days).
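The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not the code of the credit matrix tool: the function name `match_counts`, the `(employee, location, timestamp)` tuple layout, the 30-minute default window, and the sample records are all assumptions made for the example.

```python
from collections import Counter
from datetime import datetime, time, timedelta
from itertools import combinations


def match_counts(transactions, window=timedelta(minutes=30), daily_span=None):
    """Count matching transactions per employee pair.

    transactions: iterable of (employee, location, timestamp) tuples (assumed layout).
    window:       maximum time difference for two transactions to match.
    daily_span:   optional (start, end) pair of datetime.time values; transactions
                  outside this time of day are ignored.
    """
    by_location = {}
    for employee, location, timestamp in transactions:
        if daily_span and not (daily_span[0] <= timestamp.time() <= daily_span[1]):
            continue
        by_location.setdefault(location, []).append((timestamp, employee))

    counts = Counter()
    for visits in by_location.values():
        visits.sort()  # chronological order, so t2 >= t1 in every pair below
        for (t1, e1), (t2, e2) in combinations(visits, 2):
            if e1 != e2 and t2 - t1 <= window:
                counts[tuple(sorted((e1, e2)))] += 1
    return counts


# Hypothetical sample records, not taken from the challenge data.
sample = [
    ("emp_a", "Cafe", datetime(2014, 1, 6, 12, 0)),
    ("emp_b", "Cafe", datetime(2014, 1, 6, 12, 10)),
    ("emp_c", "Cafe", datetime(2014, 1, 6, 13, 30)),
    ("emp_a", "Diner", datetime(2014, 1, 6, 19, 0)),
    ("emp_b", "Diner", datetime(2014, 1, 6, 19, 20)),
]
all_day = match_counts(sample)
evenings = match_counts(sample, daily_span=(time(17, 0), time(23, 59)))
```

The resulting pair counts are the cell values of an employee-by-employee matrix; clustering its rows and columns would then group employees who frequently pay at the same place at about the same time.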

Credit card transaction patterns:

What:          Lone truck drivers
Who:           All truck drivers
Where:         Mostly airports and manufacturers
When:          Entire time period
Significance:  Little, but does show the deviation (and possible isolation) of the truck drivers from typical employees
Confidence:    High, this pattern applies to all truck drivers

What:          Engineers stick together and have a tight coffee schedule
Who:           Most of the engineering department, but F. Balas, G. Cazar, V. Frente, A. Calzas, and L. Azada in particular
Where:         Mostly the 'Bean There Done That' establishment
When:          12:00 sharp
Significance:  Little
Confidence:    High, consistent across most engineers

What:          The security department sticks together, but its group also includes other employees
Who:           The security department and D. Coginion, L. Lagos, R. Mies Haber, S. Flecha, C. Lais, M. Bramar, and B. Tempestad
Where:         Mostly Guy's Gyros, Brew've Been Served, and Hippokampos
When:          Around breakfast and dinner time
Significance:  Medium, provides connections between the already suspect security department and other GAStech employees. The connection to R. Mies Haber is of particular interest, because she is a likely POK relative.
Confidence:    High, many consistent transactions

What:          A pair of employees (male and female, same age) visits a hotel in the afternoon
Who:           B. Tempestad and I. Borrasca
Where:         Chostus Hotel
When:          January 10, 14, and 17, around 13:30
Significance:  Little
Confidence:    High, there are three separate visits, each with a transaction of around 100

Spatiotemporal patterns:

What:          Security employees appear to live close together
Who:           I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo
Where:         South-east Abila
When:          N/A
Significance:  High, there must be close contact between them
Confidence:    High, derived from track data

What:          Security employees consistently visit five different places (A-E), in varying group compositions, that no other GAStech employee visits
Who:           I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo, M. Mies
Where:         See figure below
When:          January 7, 8, 9, 10, 11, 14, 15, 17, and 18, consistently between 12:30 and 13:30
Significance:  High, what do they do there?
Confidence:    High, cross-referenced with credit card data

What:          Security employees are the only employees who visit the houses of the executives
Who:           I. Ferro, L. Bodrogi, I. Vann, H. Osvaldo, M. Mies
Where:         See figure above
Significance:  High, why do they visit them?
Confidence:    High, no other GAStech employee pays a visit

What:          GPS signal noise
Who:           E. Orilla
Where:         Everywhere
When:          Always
Significance:  Little
Confidence:    High

What:          GPS signal noise
Who:           A. Calzas
Where:         Mostly in north Abila
When:          Always
Significance:  Little
Confidence:    High

What:          Executives play golf on Sunday
Who:           Sanjorge, Vasco-Pais, Barranco, Strum, Campo-Corrente
Where:         Golf course (north Abila)
When:          Sundays
Significance:  Little; however, no one else plays golf
Confidence:    High

What:          Missing data and missing place of residence for Sanjorge (always stays at a hotel)
Who:           S. Sanjorge Jr.
Where:         Chostus Hotel
When:          Always
Significance:  Little
Confidence:    High

 

MC2.3 Like most datasets, the data you were provided is imperfect, with possible issues such as missing data, conflicting data, data of varying resolutions, outliers, or other kinds of confusing data. Considering MC2 data is primarily spatiotemporal, describe how you identified and addressed the uncertainties and conflicts inherent in this data to reach your conclusions in questions MC2.1 and MC2.2. Please limit your response to no more than five images and 300 words.

 

To answer the two questions of this mini-challenge, we had to take uncertainties and conflicts in the spatiotemporal data into account. Below is an analysis:

From this analysis we know that a particular interpretation of the data may render certain observations invalid. We addressed these potential issues as follows. We started by visualizing all track data and looking for anomalies, e.g., whether tracks (or parts of them) are missing on one or more days. If so, we could investigate those occurrences further with the credit card data (see figures below for an example). Credit card data could also be used to check for clocks that are out of sync, by comparing the time of each transaction with the time interval during which the car is parked.
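The two checks described above (days without track data, and transactions that fall outside the matching parking interval) can be sketched as follows. This is a minimal illustration under stated assumptions rather than the procedure used in our tools: the function names, the input layout, and the sample times are hypothetical, and the parking interval is assumed to have already been derived from the GPS track.

```python
from datetime import date, datetime, timedelta


def missing_track_days(timestamps, first_day, last_day):
    """Return the dates in [first_day, last_day] for which no GPS sample exists."""
    seen = {t.date() for t in timestamps}
    missing = []
    day = first_day
    while day <= last_day:
        if day not in seen:
            missing.append(day)
        day += timedelta(days=1)
    return missing


def clock_offset(transaction_time, parked_from, parked_until):
    """Return how far a transaction lies outside a parking interval.

    Zero if the transaction falls inside the interval, negative if it precedes
    the arrival, positive if it follows the departure.
    """
    if transaction_time < parked_from:
        return transaction_time - parked_from
    if transaction_time > parked_until:
        return transaction_time - parked_until
    return timedelta(0)


# Hypothetical example: a car parked from 12:05 to 12:55, with a purchase at 13:10.
parked_from = datetime(2014, 1, 6, 12, 5)
parked_until = datetime(2014, 1, 6, 12, 55)
offset = clock_offset(datetime(2014, 1, 6, 13, 10), parked_from, parked_until)
```

A single non-zero offset says little on its own; an offset that recurs with roughly the same size for one establishment or one vehicle is what would suggest a clock that is out of sync.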